Search | Portal Regional de la BVS (VHL Regional Portal)

1.

Severus: accurate detection and characterization of somatic structural variation in tumor genomes using long reads.

Keskus, Ayse; Bryant, Asher; Ahmad, Tanveer; Yoo, Byunggil; Aganezov, Sergey; Goretsky, Anton; Donmez, Ataberk; Lansdon, Lisa A; Rodriguez, Isabel; Park, Jimin; Liu, Yuelin; Cui, Xiwen; Gardner, Joshua; McNulty, Brandy; Sacco, Samuel; Shetty, Jyoti; Zhao, Yongmei; Tran, Bao; Narzisi, Giuseppe; Helland, Adrienne; Cook, Daniel E; Chang, Pi-Chuan; Kolesnikov, Alexey; Carroll, Andrew; Molloy, Erin K; Pushel, Irina; Guest, Erin; Pastinen, Tomi; Shafin, Kishwar; Miga, Karen H; Malikic, Salem; Day, Chi-Ping; Robine, Nicolas; Sahinalp, Cenk; Dean, Michael; Farooqi, Midhat S; Paten, Benedict; Kolmogorov, Mikhail.

medRxiv ; 2024 Mar 26.

Article in English | MEDLINE | ID: mdl-38585974

ABSTRACT

Most current studies rely on short-read sequencing to detect somatic structural variation (SV) in cancer genomes. Long-read sequencing offers the advantage of better mappability and long-range phasing, which results in substantial improvements in germline SV detection. However, current long-read SV detection methods do not generalize well to the analysis of somatic SVs in tumor genomes with complex rearrangements, heterogeneity, and aneuploidy. Here, we present Severus: a method for the accurate detection of different types of somatic SVs using a phased breakpoint graph approach. To benchmark various short- and long-read SV detection methods, we sequenced five tumor/normal cell line pairs with Illumina, Nanopore, and PacBio sequencing platforms; on this benchmark Severus showed the highest F1 scores (harmonic mean of the precision and recall) as compared to long-read and short-read methods. We then applied Severus to three clinical cases of pediatric cancer, demonstrating concordance with known genetic findings as well as revealing clinically relevant cryptic rearrangements missed by standard genomic panels.
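The abstract above defines the F1 score as the harmonic mean of precision and recall. A minimal sketch of that formula follows; the function name and the example counts are illustrative assumptions and are not taken from the Severus benchmark.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return F1, the harmonic mean of precision and recall.

    The counts are hypothetical call-set tallies (for example, SV calls
    matched or unmatched against a truth set); they are not from the paper.
    """
    if true_positives == 0:
        # Precision and recall are both zero (or undefined), so report 0.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative numbers only: 80 matched calls, 20 false calls, 20 missed.
    print(f1_score(80, 20, 20))
```

Because the harmonic mean is dominated by the smaller of the two values, a caller cannot reach a high F1 by trading away recall for precision or the reverse.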

2.

Dataset for Automatic Region-based Coronary Artery Disease Diagnostics Using X-Ray Angiography Images.

Popov, Maxim; Amanturdieva, Akmaral; Zhaksylyk, Nuren; Alkanov, Alsabir; Saniyazbekov, Adilbek; Aimyshev, Temirgali; Ismailov, Eldar; Bulegenov, Ablay; Kuzhukeyev, Arystan; Kulanbayeva, Aizhan; Kalzhanov, Almat; Temenov, Nurzhan; Kolesnikov, Alexey; Sakhov, Orazbek; Fazli, Siamac.

Sci Data ; 11(1): 20, 2024 Jan 03.

Article in English | MEDLINE | ID: mdl-38172163

ABSTRACT

X-ray coronary angiography is the most common tool for the diagnosis and treatment of coronary artery disease. It involves the injection of contrast agents into coronary vessels using a catheter to highlight the coronary vessel structure. Typically, multiple 2D X-ray projections are recorded from different angles to improve visualization. Recent advances in the development of deep-learning-based tools promise significant improvement in diagnosing and treating coronary artery disease. However, the limited public availability of annotated X-ray coronary angiography image datasets presents a challenge for objective assessment and comparison of existing tools and the development of novel methods. To address this challenge, we introduce a novel ARCADE dataset with 2 objectives: coronary vessel classification and stenosis detection. Each objective contains 1500 expert-labeled X-ray coronary angiography images representing: i) coronary artery segments; and ii) the locations of stenotic plaques. These datasets will serve as a benchmark for developing new methods and assessing existing approaches for the automated diagnosis and risk assessment of coronary artery disease.

Subject(s)

Coronary Artery Disease, Humans, Catheters, Contrast Media, Coronary Angiography/methods, Coronary Artery Disease/diagnostic imaging, X-Rays

3.

Local read haplotagging enables accurate long-read small variant calling.

Kolesnikov, Alexey; Cook, Daniel; Nattestad, Maria; McNulty, Brandy; Gorzynski, John; Goenka, Sneha; Ashley, Euan A; Jain, Miten; Miga, Karen H; Paten, Benedict; Chang, Pi-Chuan; Carroll, Andrew; Shafin, Kishwar.

bioRxiv ; 2023 Sep 12.

Article in English | MEDLINE | ID: mdl-37745389

ABSTRACT

Long-read sequencing technology has enabled variant detection in difficult-to-map regions of the genome and enabled rapid genetic diagnosis in clinical settings. Rapidly evolving third-generation sequencing platforms like Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) are introducing newer platforms and data types. It has been demonstrated that variant calling methods based on deep neural networks can use local haplotyping information with long reads to improve the genotyping accuracy. However, using local haplotype information creates an overhead as variant calling needs to be performed multiple times, which ultimately makes it difficult to extend to new data types and platforms as they get introduced. In this work, we have developed a local haplotype approximation method that enables state-of-the-art variant calling performance with multiple sequencing platforms, including the PacBio Revio system and ONT R10.4 simplex and duplex data. This addition of local haplotype approximation makes DeepVariant a universal variant calling solution for long-read sequencing platforms.

4.

Improving variant calling using population data and deep learning.

Chen, Nae-Chyun; Kolesnikov, Alexey; Goel, Sidharth; Yun, Taedong; Chang, Pi-Chuan; Carroll, Andrew.

BMC Bioinformatics ; 24(1): 197, 2023 May 12.

Article in English | MEDLINE | ID: mdl-37173615

ABSTRACT

Large-scale population variant data is often used to filter and aid interpretation of variant calls in a single sample. These approaches do not incorporate population information directly into the process of variant calling, and are often limited to filtering, which trades recall for precision. In this study, we develop population-aware DeepVariant models with a new channel encoding allele frequencies from the 1000 Genomes Project. This model reduces variant calling errors, improving both precision and recall in single samples, and reduces rare homozygous and pathogenic ClinVar calls cohort-wide. We assess the use of population-specific or diverse reference panels, finding the greatest accuracy with diverse panels, suggesting that large, diverse panels are preferable to individual populations, even when the population matches sample ancestry. Finally, we show that this benefit generalizes to samples with different ancestry from the training data even when the ancestry is also excluded from the reference panel.

Subject(s)

Deep Learning, Humans, Gene Frequency, Whole Genome Sequencing, Genome-Wide Association Study, Human Genome, Single Nucleotide Polymorphism, High-Throughput Nucleotide Sequencing

5.

DeepConsensus improves the accuracy of sequences with a gap-aware sequence transformer.

Baid, Gunjan; Cook, Daniel E; Shafin, Kishwar; Yun, Taedong; Llinares-López, Felipe; Berthet, Quentin; Belyaeva, Anastasiya; Töpfer, Armin; Wenger, Aaron M; Rowell, William J; Yang, Howard; Kolesnikov, Alexey; Ammar, Waleed; Vert, Jean-Philippe; Vaswani, Ashish; McLean, Cory Y; Nattestad, Maria; Chang, Pi-Chuan; Carroll, Andrew.

Nat Biotechnol ; 41(2): 232-238, 2023 02.

Article in English | MEDLINE | ID: mdl-36050551

ABSTRACT

Circular consensus sequencing with Pacific Biosciences (PacBio) technology generates long (10-25 kilobases), accurate 'HiFi' reads by combining serial observations of a DNA molecule into a consensus sequence. The standard approach to consensus generation, pbccs, uses a hidden Markov model. We introduce DeepConsensus, which uses an alignment-based loss to train a gap-aware transformer-encoder for sequence correction. Compared to pbccs, DeepConsensus reduces read errors by 42%. This increases the yield of PacBio HiFi reads at Q20 by 9%, at Q30 by 27% and at Q40 by 90%. With two SMRT Cells of HG003, reads from DeepConsensus improve hifiasm assembly contiguity (NG50 4.9 megabases (Mb) to 17.2 Mb), increase gene completeness (94% to 97%), reduce the false gene duplication rate (1.1% to 0.5%), improve assembly base accuracy (Q43 to Q45) and reduce variant-calling errors by 24%. DeepConsensus models could be trained to the general problem of analyzing the alignment of other types of sequences, such as unique molecular identifiers or genome assemblies.

Subject(s)

High-Throughput Nucleotide Sequencing, DNA Sequence Analysis
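The Q20, Q30, Q40, Q43 and Q45 figures in the abstract above are quality values on the Phred scale, where Q = -10 * log10(P) and P is the per-base error probability. The sketch below shows only that standard conversion; the function names are illustrative and nothing here reproduces how DeepConsensus or pbccs assign qualities.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality value Q into a per-base error probability."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Convert a per-base error probability (0 < p <= 1) into a Phred value."""
    if not 0 < p <= 1:
        raise ValueError("p must be in the interval (0, 1]")
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Each step of 10 on the Phred scale is a tenfold drop in error rate.
    for q in (20, 30, 40):
        print(f"Q{q}: about 1 error in {round(1 / phred_to_error_probability(q))} bases")
```

Read this way, a gain in read yield at Q40 concerns reads with roughly one expected error per 10,000 bases, which is why gains at that threshold are harder to obtain than gains at Q20.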

6.

The association between gambling marketing and unplanned gambling spend: Synthesised findings from two online cross-sectional surveys.

Wardle, Heather; Critchlow, Nathan; Brown, Ashley; Donnachie, Craig; Kolesnikov, Alexey; Hunt, Kate.

Addict Behav ; 135: 107440, 2022 12.

Article in English | MEDLINE | ID: mdl-35973384

ABSTRACT

BACKGROUND: In 2020, the British Government initiated a review about whether to introduce stricter controls on gambling marketing. We examine: (i) what proportion of regular sports bettors and emergent adult gamblers report that marketing has prompted unplanned spend; and (ii) what factors are associated with reporting that marketing had prompted unplanned spend. METHODS: Data are from two British non-probability online surveys with: (i) emerging adults (16-24 years; n = 3,549; July/August 2019) and (ii) regular sports bettors (18+; n = 3,195; November 2020). Among current gamblers, logistic regressions examined whether reporting that gambling marketing had prompted unplanned spend (vs never) was associated with past-month marketing awareness, past-month receipt of direct marketing (e.g., e-mails), following gambling brands on social media, and problem gambling classification. RESULTS: Almost a third of current gamblers reported that marketing had prompted unplanned gambling spend (sports bettors: 31.2 %; emerging adults: 29.5 %). Escalated severity of problem gambling was associated with reporting that marketing had prompted unplanned spend in both samples, in particular those experiencing gambling problems compared to those experiencing no problems (sports bettors: ORAdj = 17.01, 95 % CI: 10.61-27.27; emerging adults: ORAdj = 11.67, 95 % CI: 6.43-21.12). Receipt of at least one form of direct marketing in the past month and following a gambling brand on at least one social media platform were also associated with unplanned spend among sports bettors and emerging adults. CONCLUSION: Among emerging adults and regular sports bettors, increased severity of gambling problems, receiving direct marketing, and following gambling brands on social media are associated with reporting that marketing has prompted unplanned spend.

Subject(s)

Gambling, Sports, Adult, Cross-Sectional Studies, Gambling/epidemiology, Humans, Marketing, Surveys and Questionnaires

7.

PrecisionFDA Truth Challenge V2: Calling variants from short and long reads in difficult-to-map regions.

Olson, Nathan D; Wagner, Justin; McDaniel, Jennifer; Stephens, Sarah H; Westreich, Samuel T; Prasanna, Anish G; Johanson, Elaine; Boja, Emily; Maier, Ezekiel J; Serang, Omar; Jáspez, David; Lorenzo-Salazar, José M; Muñoz-Barrera, Adrián; Rubio-Rodríguez, Luis A; Flores, Carlos; Kyriakidis, Konstantinos; Malousi, Andigoni; Shafin, Kishwar; Pesout, Trevor; Jain, Miten; Paten, Benedict; Chang, Pi-Chuan; Kolesnikov, Alexey; Nattestad, Maria; Baid, Gunjan; Goel, Sidharth; Yang, Howard; Carroll, Andrew; Eveleigh, Robert; Bourgey, Mathieu; Bourque, Guillaume; Li, Gen; Ma, ChouXian; Tang, LinQi; Du, YuanPing; Zhang, ShaoWei; Morata, Jordi; Tonda, Raúl; Parra, Genís; Trotta, Jean-Rémi; Brueffer, Christian; Demirkaya-Budak, Sinem; Kabakci-Zorlu, Duygu; Turgut, Deniz; Kalay, Özem; Budak, Gungor; Narci, Kübra; Arslan, Elif; Brown, Richard; Johnson, Ivan J.

Cell Genom ; 2(5), 2022 May 11.

Article in English | MEDLINE | ID: mdl-35720974

ABSTRACT

The precisionFDA Truth Challenge V2 aimed to assess the state of the art of variant calling in challenging genomic regions. Starting with FASTQs, 20 challenge participants applied their variant-calling pipelines and submitted 64 variant call sets for one or more sequencing technologies (Illumina, PacBio HiFi, and Oxford Nanopore Technologies). Submissions were evaluated following best practices for benchmarking small variants with updated Genome in a Bottle benchmark sets and genome stratifications. Challenge submissions included numerous innovative methods, with graph-based and machine learning methods scoring best for short-read and long-read datasets, respectively. With machine learning approaches, combining multiple sequencing technologies performed particularly well. Recent developments in sequencing and variant calling have enabled benchmarking variants in challenging genomic regions, paving the way for the identification of previously unknown clinically relevant variants.

8.

Accelerated identification of disease-causing variants with ultra-rapid nanopore genome sequencing.

Goenka, Sneha D; Gorzynski, John E; Shafin, Kishwar; Fisk, Dianna G; Pesout, Trevor; Jensen, Tanner D; Monlong, Jean; Chang, Pi-Chuan; Baid, Gunjan; Bernstein, Jonathan A; Christle, Jeffrey W; Dalton, Karen P; Garalde, Daniel R; Grove, Megan E; Guillory, Joseph; Kolesnikov, Alexey; Nattestad, Maria; Ruzhnikov, Maura R Z; Samadi, Mehrzad; Sethia, Ankit; Spiteri, Elizabeth; Wright, Christopher J; Xiong, Katherine; Zhu, Tong; Jain, Miten; Sedlazeck, Fritz J; Carroll, Andrew; Paten, Benedict; Ashley, Euan A.

Nat Biotechnol ; 40(7): 1035-1041, 2022 07.

Article in English | MEDLINE | ID: mdl-35347328

ABSTRACT

Whole-genome sequencing (WGS) can identify variants that cause genetic disease, but the time required for sequencing and analysis has been a barrier to its use in acutely ill patients. In the present study, we develop an approach for ultra-rapid nanopore WGS that combines an optimized sample preparation protocol, distributing sequencing over 48 flow cells, near real-time base calling and alignment, accelerated variant calling and fast variant filtration for efficient manual review. Application to two example clinical cases identified a candidate variant in <8 h from sample preparation to variant identification. We show that this framework provides accurate variant calls and efficient prioritization, and accelerates diagnostic clinical genome sequencing twofold compared with previous approaches.

Subject(s)

Nanopore Sequencing, Nanopores, Chromosome Mapping, High-Throughput Nucleotide Sequencing/methods, Humans, Whole Genome Sequencing/methods

9.

Ultrarapid Nanopore Genome Sequencing in a Critical Care Setting.

Gorzynski, John E; Goenka, Sneha D; Shafin, Kishwar; Jensen, Tanner D; Fisk, Dianna G; Grove, Megan E; Spiteri, Elizabeth; Pesout, Trevor; Monlong, Jean; Baid, Gunjan; Bernstein, Jonathan A; Ceresnak, Scott; Chang, Pi-Chuan; Christle, Jeffrey W; Chubb, Henry; Dalton, Karen P; Dunn, Kyla; Garalde, Daniel R; Guillory, Joseph; Knowles, Joshua W; Kolesnikov, Alexey; Ma, Michael; Moscarello, Tia; Nattestad, Maria; Perez, Marco; Ruzhnikov, Maura R Z; Samadi, Mehrzad; Setia, Ankit; Wright, Chris; Wusthoff, Courtney J; Xiong, Katherine; Zhu, Tong; Jain, Miten; Sedlazeck, Fritz J; Carroll, Andrew; Paten, Benedict; Ashley, Euan A.

N Engl J Med ; 386(7): 700-702, 2022 02 17.

Article in English | MEDLINE | ID: mdl-35020984

Subject(s)

Critical Care, Nanopore Sequencing/methods, Neurodevelopmental Disorders/diagnosis, Adolescent, Preschool Child, Female, Humans, Infant, Newborn Infant, Male, Middle Aged, Mutation, Nanopore Sequencing/economics, Neurodevelopmental Disorders/genetics, DNA Sequence Analysis/methods, Status Epilepticus/genetics

10.

Haplotype-aware variant calling with PEPPER-Margin-DeepVariant enables high accuracy in nanopore long-reads.

Shafin, Kishwar; Pesout, Trevor; Chang, Pi-Chuan; Nattestad, Maria; Kolesnikov, Alexey; Goel, Sidharth; Baid, Gunjan; Kolmogorov, Mikhail; Eizenga, Jordan M; Miga, Karen H; Carnevali, Paolo; Jain, Miten; Carroll, Andrew; Paten, Benedict.

Nat Methods ; 18(11): 1322-1332, 2021 11.

Article in English | MEDLINE | ID: mdl-34725481

ABSTRACT

Long-read sequencing has the potential to transform variant detection by reaching currently difficult-to-map regions and routinely linking together adjacent variations to enable read-based phasing. Third-generation nanopore sequence data have demonstrated a long read length, but current interpretation methods for their novel pore-based signal have unique error profiles, making accurate analysis challenging. Here, we introduce a haplotype-aware variant calling pipeline, PEPPER-Margin-DeepVariant, that produces state-of-the-art variant calling results with nanopore data. We show that our nanopore-based method outperforms the short-read-based single-nucleotide-variant identification method at the whole-genome scale and produces high-quality single-nucleotide variants in segmental duplications and low-mappability regions where short-read-based genotyping fails. We show that our pipeline can provide highly contiguous phase blocks across the genome with nanopore reads, contiguously spanning between 85% and 92% of annotated genes across six samples. We also extend PEPPER-Margin-DeepVariant to PacBio HiFi data, providing an efficient solution with superior performance over the current WhatsHap-DeepVariant standard. Finally, we demonstrate de novo assembly polishing methods that use nanopore and PacBio HiFi reads to produce diploid assemblies with high accuracy (Q35+ nanopore-polished and Q40+ PacBio HiFi-polished).

Subject(s)

Genes, Haplotypes, High-Throughput Nucleotide Sequencing/methods, Nanopores, Single Nucleotide Polymorphism, DNA Sequence Analysis/methods, Software, Human Genome, Humans, Molecular Sequence Annotation

11.

A population-specific reference panel for improved genotype imputation in African Americans.

O'Connell, Jared; Yun, Taedong; Moreno, Meghan; Li, Helen; Litterman, Nadia; Kolesnikov, Alexey; Noblin, Elizabeth; Chang, Pi-Chuan; Shastri, Anjali; Dorfman, Elizabeth H; Shringarpure, Suyash; Auton, Adam; Carroll, Andrew; McLean, Cory Y.

Commun Biol ; 4(1): 1269, 2021 11 05.

Article in English | MEDLINE | ID: mdl-34741098

ABSTRACT

There is currently a dearth of accessible whole genome sequencing (WGS) data for individuals residing in the Americas with Sub-Saharan African ancestry. We generated whole genome sequencing data at intermediate (15×) coverage for 2,294 individuals with large amounts of Sub-Saharan African ancestry, predominantly Atlantic African admixed with varying amounts of European and American ancestry. We performed extensive comparisons of variant callers, phasing algorithms, and variant filtration on these data to construct a high quality imputation panel containing data from 2,269 unrelated individuals. With the exception of the TOPMed imputation server (which notably cannot be downloaded), our panel substantially outperformed other available panels when imputing African American individuals. The raw sequencing data, variant calls and imputation panel for this cohort are all freely available via dbGaP and should prove an invaluable resource for further study of admixed African genetics.

Subject(s)

Human Genome, Genotype, Adult, Black or African American, Aged, Aged 80 and over, Humans, Middle Aged, United States, Whole Genome Sequencing, Young Adult

12.

A Rare Case of Takayasu Arteritis With Total Left Main Coronary Artery Occlusion.

Rakisheva, Amina; Mashkunova, Olga; Kolesnikov, Alexey; Urazalina, Saule; Mussagaliyeva, Aisulu.

JACC Case Rep ; 2(2): 312-313, 2020 Feb.

Article in English | MEDLINE | ID: mdl-34317230

ABSTRACT

We report the case of a young woman with chest pain and recurrent abortion. The patient was found to have Takayasu arteritis. Drug therapy was started, and emergency bypass surgery was performed. The case showed the possible clinical manifestation of vasculitis as a recurrent abortion, followed by total occlusion of the left main coronary artery. (Level of Difficulty: Intermediate.)

13.

Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome.

Wenger, Aaron M; Peluso, Paul; Rowell, William J; Chang, Pi-Chuan; Hall, Richard J; Concepcion, Gregory T; Ebler, Jana; Fungtammasan, Arkarachai; Kolesnikov, Alexey; Olson, Nathan D; Töpfer, Armin; Alonge, Michael; Mahmoud, Medhat; Qian, Yufeng; Chin, Chen-Shan; Phillippy, Adam M; Schatz, Michael C; Myers, Gene; DePristo, Mark A; Ruan, Jue; Marschall, Tobias; Sedlazeck, Fritz J; Zook, Justin M; Li, Heng; Koren, Sergey; Carroll, Andrew; Rank, David R; Hunkapiller, Michael W.

Nat Biotechnol ; 37(10): 1155-1162, 2019 10.

Article in English | MEDLINE | ID: mdl-31406327

ABSTRACT

The DNA sequencing technologies in use today produce either highly accurate short reads or less-accurate long reads. We report the optimization of circular consensus sequencing (CCS) to improve the accuracy of single-molecule real-time (SMRT) sequencing (PacBio) and generate highly accurate (99.8%) long high-fidelity (HiFi) reads with an average length of 13.5 kilobases (kb). We applied our approach to sequence the well-characterized human HG002/NA24385 genome and obtained precision and recall rates of at least 99.91% for single-nucleotide variants (SNVs), 95.98% for insertions and deletions <50 bp (indels) and 95.99% for structural variants. Our CCS method matches or exceeds the ability of short-read sequencing to detect small variants and structural variants. We estimate that 2,434 discordances are correctable mistakes in the 'genome in a bottle' (GIAB) benchmark set. Nearly all (99.64%) variants can be phased into haplotypes, further improving variant detection. De novo genome assembly using CCS reads alone produced a contiguous and accurate genome with a contig N50 of >15 megabases (Mb) and concordance of 99.997%, substantially outperforming assembly with less-accurate long reads.

Subject(s)

Circular DNA/genetics, Human Genome, High-Throughput Nucleotide Sequencing/methods, DNA Sequence Analysis/methods, Base Sequence, Genetic Variation, Haplotypes, Humans
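The contig N50 quoted in the abstract above is a contiguity statistic: the length L such that contigs of length L or longer together cover at least half of the total assembly. A minimal sketch of the usual calculation follows; the function name, the optional `total` argument (which gives the NG50 variant when set to an assumed genome size) and the example lengths are illustrative and are not taken from the paper.

```python
from typing import Iterable, Optional


def n50(lengths: Iterable[int], total: Optional[int] = None) -> Optional[int]:
    """Return the N50 of a set of contig lengths.

    If `total` is given (for example an assumed genome size), the half-way
    target is computed from it instead of from the summed contig lengths,
    which corresponds to NG50. Returns None when the contigs cover less
    than half of `total`.
    """
    contigs = sorted(lengths, reverse=True)
    if not contigs:
        raise ValueError("lengths must be non-empty")
    target = (sum(contigs) if total is None else total) / 2
    running = 0
    for length in contigs:
        running += length
        if running >= target:
            return length
    return None


if __name__ == "__main__":
    toy_contigs = [2, 3, 4, 5, 6]  # toy lengths, not real assembly data
    print(n50(toy_contigs))            # N50 against the assembly's own size
    print(n50(toy_contigs, total=30))  # NG50 against an assumed genome size
```

Using the genome size as the denominator makes values comparable across assemblies of different total length, which is the usual reason for reporting NG50 alongside N50.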
